STATISTICAL  INTERPRETATION 


CANADIAN  HYDROMETRIC  DATA 




Digitized  by  the  Internet  Archive 
in  2018  with  funding  from 
University  of  Alberta  Libraries 


https://archive.org/details/ansley1959 




THE  UNIVERSITY  OF  ALBERTA 




STATISTICAL  INTERPRETATION  OF  CANADIAN  HYDROMETRIC  DATA 


A  DISSERTATION 

SUBMITTED  TO  THE  FACULTY  OF  GRADUATE  STUDIES 
IN  PARTIAL  FULFILMENT  OF  THE  REQUIREMENTS  FOR  THE  DEGREE 
OF  MASTER  OF  SCIENCE 


DEPARTMENT  OF  CIVIL  ENGINEERING 
by 

Ralph  W.  Ansley,  B.A.,  B.Sc. 


EDMONTON,  ALBERTA 
APRIL,  1959 


ABSTRACT 


The initiation of a water resources inventory for the Province
of Alberta produced many problems in the interpretation of available data.
This thesis discusses the feasibility of adapting standard hydrologic
interpretation methods to the data. Frequency analysis is presented as the most
practical approach and a logarithmic normal fitting curve is advocated.

The hypothesis is justified statistically and statistical
interpretation of the data is presented in a nomograph form to standardize
calculations. Confidence bands are discussed and nomographs have been
prepared for the construction of these bands.




ACKNOWLEDGMENTS


The author wishes to extend his appreciation to:

The Government of Alberta Water Resources Branch under whose sponsorship
the program was carried out,

Professor T. Blench for his helpful criticism and guidance throughout,

Assistant Professor J. P. Verschuren for his interest and valuable information
concerning hydrometric data.


TABLE  OF  CONTENTS 


Page 

INTRODUCTION  1 

TEXT 

1.  River  Discharge  Data  3 

2.  The  Nature  of  a  Frequency  Analysis  3 

3.  The  Use  of  Frequency  Analysis  5 

4.  The  reliability  of  a  Smoothing  Curve  Within  Range  of  Data  5 

4(a).  The  Extrapolation  of  a  Smoothing  Curve  8 

5.  Value  of  Some  Smoothing  Curves  in  Assisting  Extrapolation  9 

6.  The  Normal  Fitting  Curve  9 

7.  The  Logarithmic  Normal  Curve  10 

8.  Other  Curves  11 

9.  A  Practical  Utility  of  the  Log  Normal  Curve  11 

10.  Reduction  of  Gauge  Data  to  Suit  Log  Prob.  Test  12 

11.  Validity  of  Log  Prob.  Fitting  Curve  for  Annual  Peak  Floods  12 

12.  The  Use  of  Confidence  Bands  14 

13.  Method  of  Fitting  Log  Prob.  Curve  to  Data  14 

14.  Conclusions  15

15.  Recommendations  for  Further  Studies  15 

APPENDIX  1 

CONSTRUCTION  OF  HISTOGRAMS  AND  METHOD  OF 
VARIATE  INTERVAL  MANIPULATION 

Analysis  of  the  Sample  17 

Histogram  17 

Histogram  of  the  Logarithms  of  X  27 



Change  of  Class  Interval  27 

APPENDIX  2 

DERIVATION OF MEAN AND STANDARD DEVIATION OF LOG
NORMAL DISTRIBUTION FROM MEAN AND STANDARD
DEVIATION OF SAMPLE

Log  Normal  Distribution  28 

Calculation of Mean and Standard Deviation of Observed Distribution  30

APPENDIX  3 

CHI  SQUARE  TEST  FOR  LOG  NORMAL  FIT  OF  ANNUAL 
MAXIMUM  DISCHARGE  NORTH  SASKATCHEWAN  RIVER 

Chi-Square  Test  33 

APPENDIX  4 

INSTRUCTION  FOR  FITTING  LOG  NORMAL  LINE 
TO  DATA 

Plotting  Data  36 

Calculation  of  Best  Fitting  Line  36 

APPENDIX  5 

INSTRUCTIONS  FOR  FITTING  CONFIDENCE  BANDS 
TO  BEST  LOG  NORMAL  FIT 

Sampling  Distribution  42 

Calculation of Standard Errors of Percentiles for Normal Sampling Distribution  42

REFERENCES  47


LIST  OF  FIGURES 


Page 

Fig.  1  Histogram  4 

Fig.  2  Log  Prob.  Plot  of  Annual  Maximum  Discharges  for  the  North 

Saskatchewan  River  at  Edmonton,  Years  1911  -  1957  6 

Fig.  3  Log  Prob.  Plot  of  Annual  Maximum  Discharges  for  the  North 
Saskatchewan  River  at  Edmonton,  Years  1911  -  1914  and 
1916  -  1957.  7 

Fig.  4  Log  Prob.  Plot  of  Annual  Maximum  Discharges  for  the  North 

Saskatchewan  River  at  Edmonton,  fitted  with  68%  Confidence 
Bands,  Years  1911  -  1957.  13 

Fig.  5  Histogram  of  Annual  Maximum  Discharges  for  the  North 

Saskatchewan  River  at  Edmonton  with  Variate  Class  Interval 
of  5,000  cusecs.  22 

Fig.  6  Histogram  of  Logarithms  of  Annual  Maximum  Discharges  for 

the  North  Saskatchewan  River  at  Edmonton  with  variate  class 
Interval  of  5,000  cusecs.  23 

Fig.  7  Histogram  of  Annual  Maximum  Discharges  for  the  North 

Saskatchewan  River  at  Edmonton  with  Variate  Class  Interval 
of  10,000  cusecs.  25 

Fig.  8  Histogram  of  Logarithms  of  Annual  Maximum  Discharges  for 

the  North  Saskatchewan  River  at  Edmonton  with  variate  class 
Interval  of  10,000  cusecs.  26 

Fig.  9  Log  Prob.  Plot  of  Annual  Maximum  Discharges  for  the  North 

Saskatchewan  River  at  Edmonton  years  1911  -  1957.  38 

Fig.  10  Illustration  of  Best  Fit.  41 

Fig.  11  Construction  of  Confidence  Bands.  45 


LIST  OF  TABLES 


Page 

Table  1  Maximum  Annual  Discharges  North  Saskatchewan  River 

at  Edmonton,  1911  -  1957.  19 

Table  2  Maximum  Annual  Discharges  North  Saskatchewan  River 

at  Edmonton  in  order  of  magnitude.  20 

Table  3  Frequency  of  Discharges  per  5,000  cusecs  interval.  21 

Table  4  Frequency  of  discharges  per  10,000  cusecs  interval.  24 

Table  5  Tabular  calculation  of  mean  and  standard  deviation  of 

sample  using  data  of  North  Saskatchewan  River  at 
Edmonton.  31 

Table  6  Tabular  calculation  of  Chi  square  test  for  log  normal 

fit  of  annual  maximum  discharges  at  Edmonton.  34 

Table  7  Frequency  Analysis  of  North  Saskatchewan  at  Edmonton.  37 

Table  8  CT  Interval.  40

Table  9  Standard  Error  Interval.  44


INTRODUCTION 


The measured quantities on which a water resources estimate for a
given area must depend are the observed river discharges and observed
precipitations at widely scattered points. However, river discharges and the
precipitations which cause them are widely variable over long periods of time. For
example, the annual precipitation at Edmonton may be expected to have at least
double its median annual value twice during a century and not more than half the
median value with about the same frequency. Although the North Saskatchewan
River at Edmonton, for example, has a range of approximately three times to
one-third median discharge with comparable frequency, smaller Albertan rivers
can show a range from ten times to one-tenth. Thus, with the short term records
available, the problem of predicting the range of river discharges is essentially
one of calculating probabilities of occurrences in a large population from a small
sample.

While engaged on the initial planning of a water resources investigation
for Alberta being carried out by the Alberta Government Water Resources Dept.,
the author was impressed by the limited use of statistical methods of frequency
distributions and failure to use statistical mathematics to evaluate such
distributions. Such limitations are particularly unfortunate in Alberta since precipitation
records are scanty or nonexistent. There are only two recording rain gauges in
the Province and approximately one rain gauge per 1,350 square miles which give
year round precipitations, whereas average rainfall station coverage of the U.S.A.
is about one per 275 square miles of which three-eighths are the recording type
(Ref. 7, page 72). Therefore river discharge records give the only usable data
and even these data are quite scanty in the northern half of the Province.
Accordingly, this thesis attempts to show why and how the extended use of frequency
analysis, using some statistical mathematics, could extract more information from
existing data with better evaluation.

Although the objectives and conclusions of this thesis concern practical
hydrologists and civil engineers engaged in hydrologic studies and can be
appreciated without specialized mathematics, the detailed reasons and practical methods
must be explained in terms of statistics which is a highly specialized branch of
mathematics. Therefore the author has presented the thesis in two parts. The
main body of the thesis is intended to present essential ideas in a non-mathematical
discussion. The mathematical arguments and instructions on use of mathematical
methods are outlined in a series of appendices.




TEXT 


1.  River  Discharge  Data 

The  data  discussed  and  processed  in  this  thesis  are  alleged  daily  river 
discharges  at  selected  gauging  points.  These  data  are  published  in  the  annual 
Water  Resources  Papers  by  Water  Resources  Division,  Department  of  Resources 
and  Development,  Government  of  Canada,  Ottawa,  Canada.  The  reported  discharges 
are  normally  the  daily  means  as  deduced  from  gauge  records  which  may  or  may  not 
be  continuously  recorded.  The  gauge  readings  are  related  to  discharge  by 
occasional  field  discharge  observations  which  are  used  to  construct  discharge 
rating  curves.  However,  amendment  must  be  made  if  river  regime  is  changing  or 
the  gauge  zero  is  disturbed.  The  absolute  accuracy  of  a  discharge  observation  is 
doubtful  but  is  probably  90%  under  favourable  conditions.  Rating  curves  prepared 
from  observations  under  ice  must  be  interpreted  cautiously.  High  flood  discharges 
are  seldom  observed  and  are  usually  estimated  by  speculative  methods  such  as 
extrapolation.  Such  estimates  should  be  considered  as  being  liable  to  approximately 
20%  error.  In  engineering  practice  the  reported  data  must  be  analysed  but  their 
approximate  nature  and  the  possibility  of  isolated  major  errors  should  be  kept  in 
mind. 

2.  The  Nature  of  a  Frequency  Analysis 

Essentially  the  simple  engineering  operation  of  frequency  analysis  consists 
of  organizing  observed  values  of  variates  *  into  a  histogram  or  a  distribution  curve. 


*  Variate  is  a  variable  for  which  a  distribution  function  exists,  (Ref.  5,  page  25). 


[Fig. 1: Histogram]

Figure  1  shows  a  histogram  in  which  frequency  per  unit  variate  interval  is  plotted 
against  the  variate.  The  area  of  any  block  (e.g.  the  shaded  area  on  figure  1) 
represents  the  proportion  of  observations  which  lie  within  the  range  represented 
by  the  base  of  the  block.  The  distribution  curve  represents  (a)  the  proportions  of 
total  observations  having  a  size  greater  than  a  series  of  stated  magnitudes  which 
are  plotted  against  (b)  these  magnitudes.  Obviously  the  distribution  curve  is  the 
integral curve of the histogram.
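The relation between the two plots can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `exceedance_curve` and the small set of class counts are hypothetical and are not taken from the thesis data. The sketch sums histogram blocks from the top class downward, which is the discrete form of integrating the histogram.

```python
from itertools import accumulate


def exceedance_curve(class_lower_bounds, counts):
    """Return (magnitude, proportion of observations >= magnitude) pairs.

    The proportions are found by summing histogram counts from the highest
    class downward, i.e. by integrating the histogram.
    """
    total = sum(counts)
    # Running sum of counts, accumulated from the top class downward.
    tail_counts = list(accumulate(reversed(counts)))[::-1]
    return [(x, c / total) for x, c in zip(class_lower_bounds, tail_counts)]


# Hypothetical class counts, for illustration only (not thesis data).
bounds = [0, 10, 20, 30]
counts = [2, 5, 2, 1]
curve = exceedance_curve(bounds, counts)
for magnitude, proportion in curve:
    print(magnitude, proportion)
```

Each pair gives a stated magnitude and the proportion of observations at least that large, which is what the distribution curve plots.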

3.  The Use of Frequency Analysis

After  data  have  been  plotted  in  either  standard  manner  they  are  fitted 
with  a  smoothing  curve  which  is  presumed  to  show  what  would  have  been  obtained 
from  a  far  larger  number  of  data  than  were  available,  provided  that  conditions 
affecting  data  remain  constant.  The  fitting  curve  is  then  used: 

(a)  within the limits of data to indicate frequency with which various
     values of the variate may be expected to be exceeded on an average
     over a long time,

(b)  extrapolated beyond the limits of the data for the same purpose.

4.  The  Reliability  of  a  Smoothing  Curve  Within  the  Range  of  Data 

Statistical mathematics can be used to assess the reliability of a smoothing
curve even if observed values deviate only slightly from it. Common sense tests
will also give a measure of reliability. However, many workers (Refs. 2, 4, 7)
appear unaware that there is a problem, which suggests a widespread but mistaken
opinion that a good curve fit may be taken as a close approximation to truth.

As a common sense illustration, suppose that the exceedingly high top
flood of the record on which figure 2 was based had been omitted due to
discontinuation of records or having taken place just outside record duration. Figure 3
shows the record replotted without this top flood. Notice that figure 2 shows an
upward trend curve fit with a 50-year flood of approximately 180,000 cusecs while
figure 3 shows a straight line fit with a 50-year flood of approximately 120,000
cusecs; both curve and line fit their data excellently.

[Fig. 2: Log Prob. Plot of Annual Maximum Discharges for the North Saskatchewan
River at Edmonton, Years 1911 - 1957. Probability axis: 0.01 to 99.99 per cent.]

[Fig. 3: Log Prob. Plot of Annual Maximum Discharges for the North Saskatchewan
River at Edmonton, Years 1911 - 1914 and 1916 - 1957. Probability axis: 0.01 to
99.99 per cent.]
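The sensitivity described in this illustration can be explored numerically. The sketch below is not the author's procedure: figures 2 and 3 were fitted graphically, whereas here a straight log normal line is fitted to both records by the mean and standard deviation of the logarithms, so the printed values should not be expected to reproduce the 180,000 and 120,000 cusecs read from the figures. The function name `lognormal_flood` is hypothetical; the discharges are those listed in Table 1.

```python
import math
from statistics import NormalDist, mean, stdev

# Annual maximum discharges (cusecs), Table 1, in order of year 1911 to 1957.
PEAKS = [
    51442, 74100, 32600, 61740, 185560, 58800, 65597, 35347, 19885, 57220,
    24888, 25760, 84100, 27500, 75800, 58700, 40400, 61200, 38100, 23700,
    39200, 66000, 34400, 28100, 46300, 40400, 31500, 40000, 30200, 35700,
    26720, 42250, 44020, 121970, 20940, 44730, 28600, 65440, 32680, 50330,
    39020, 109700, 44900, 106600, 30380, 25460, 21820,
]


def lognormal_flood(peaks, return_period_years):
    """Flood of the given return period from a log normal fit by moments."""
    logs = [math.log10(q) for q in peaks]
    # Standard normal deviate exceeded once in return_period_years on average.
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period_years)
    return 10 ** (mean(logs) + z * stdev(logs))


with_top = lognormal_flood(PEAKS, 50)
# Same record with the single highest flood (1915) left out.
without_top = lognormal_flood([q for q in PEAKS if q != max(PEAKS)], 50)
print(round(with_top), round(without_top))
```

Whatever the fitting method, the point of the comparison is the same: one observation at the upper end of a 47-year record moves the estimated 50-year flood appreciably.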

A distribution curve, being of the integral type, smooths out considerable
irregularities of the histogram from which it was obtained and thus mere
smoothness does not prove validity at the ends where insufficient data mean unreliable
answers. This will be discussed mathematically in paragraph 11.

4(a).  The Extrapolation of a Smoothing Curve

Since the validity of a smoothing curve within the range of its data may
be poor near the ends, extrapolation should be performed with extreme caution. This
has been emphasized by the A.S.C.E. Subcommittee on Review of Flood Frequency
Methods (Ref. 9). However, the practicing engineer is vitally interested in
happenings that may occur at long intervals, say 50, 100 or even 500 years. Available
records seldom exceed 50 years. The standard method of extrapolation is to fit
the data of a distribution curve by statistical or other formulae rather than by eye.
Then all interpretations of the data will be the same and permit standardized design
criteria. For example, it may be agreed that very large dams, whose failure would
be a catastrophe, should be designed to a 1,000-year flood. This does not imply
acceptance of this value as a true measure of such a rare event but it does purport
to put a value on the event outside the range of practical possibility. Similarly
it might be agreed to design less important hydraulic structures for a 100-year
flood which is within the order of a formal extrapolation. Of course this criterion
must be accepted as an economic risk even after allowing for error in
extrapolation.
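The economic risk attached to a design return period can be put in numbers with a standard probability identity. The sketch below assumes that annual floods are independent from year to year and that the T-year flood has a probability of 1/T of being exceeded in any one year; the function name `risk_of_exceedance` and the design life argument are illustrative and do not come from the thesis.

```python
def risk_of_exceedance(return_period_years, design_life_years):
    """Probability that the T-year flood is exceeded at least once in N years.

    Assumes independent years, each with exceedance probability 1/T.
    """
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** design_life_years


# Chance of at least one exceedance of the design flood in 100 years.
print(round(risk_of_exceedance(100, 100), 3))    # 100-year design flood
print(round(risk_of_exceedance(1000, 100), 3))   # 1,000-year design flood
```

On these assumptions a structure designed for the 100-year flood is more likely than not to see that flood exceeded during a century of service, which is why the criterion has to be treated as an accepted risk rather than as protection.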


5.  Value  of  Some  Smoothing  Curves  In  Assisting  Extrapolation 

There is a possibility that certain types of hydrologic data might be
fitted by standard statistical curves having a mathematically deducible origin. If
the data are consistently well fitted, there is a high probability that physical
conditions involved are expressed satisfactorily in the mathematics. Such a curve
would be considered as having physical validity and therefore would give
extrapolation with more assurance. Other curves may fit the actual data but without physical
validity they cannot give valid estimates outside the range of the data. Unfortunately,
all ordinary statistical curves extend to infinity, so their validity is limited.

6.  The "Normal" ("Gaussian" or "Error Function") Fitting Curve

Of the special statistical curves which reflect physical characteristics,
the most outstanding and commonly used is the normal curve (Ref. 5). The
histograms of many different variates are fitted with the normal curve. Examples of such
variates are small errors in precision measurements, examination marks of classes
and velocity fluctuations in fully turbulent flow. The curve can be deduced from a
variety of postulates of which one interesting theory is that the variate owes its
value to the summation of large numbers of sub-variates which are of the same
order of magnitude and independent (Ref. 8). If this theory is applied to discharge
data we could imagine rainfall falling at random on the catchment in small quantities
with the sum of these quantities being the discharge at a gauging site. Supposing
that such a simple picture of flood causes just given withstood examination,
considerable assurance could be accepted in extrapolation of a normal curve fit to
actual discharge data. However, the physical limits of such extrapolation cannot
be calculated since the curve extends to infinity. Some river flood data actually are
fitted by the normal curve fairly well but such occurrences are quite rare. The
practical test for normal distribution is to plot data on standard probability paper
and see if a straight line fit results.

7.  The Logarithmic Normal Curve

The normal curve is symmetrical but most river discharge data show
markedly asymmetric histograms (Fig. 5). A standard statistical manipulation to remove
this asymmetry or skewness is to replot the data in terms of the logarithm of the
variate. Often this plot is fitted very well by the normal curve. The practical test is
again a simple plot of data on logarithmic probability paper. This test was carried out
by Kuiper (Ref. 6) for 100 North American rivers with good straight line fits drawn
by eye. On the basis of these tests he concluded that maximum annual flood flows
could be fitted with logarithmic normal frequency distributions. In subsequent
discussion of his paper by other hydrologists some alleged exceptions were produced.
The author is familiar with 20 additional rivers in Alberta which give good
logarithmic normal fits. Other variates which can be best fitted by this distribution
are number of silver particles in photographic emulsions, survival time of
bacteria in disinfectants (Ref. 5) and particle size in river bed sands based on
weight (Ref. 1). The physical validity of the log normal distribution merits comment
but no opinions seem to have been published. As an extension of the discussion of
paragraph 6, a possible mathematical explanation is that the sub-contributions of
a discharge are multiplied rather than added since logarithm addition is actually
multiplication of variate values. For high floods this could be expressed as
percentage infiltration decreasing and flood wave velocity increasing with increased
precipitation. The author has successfully fitted minimum discharges with the
log probability distribution but more quantitative work is necessary.
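The straight line test on probability paper described in paragraphs 6 and 7 can be imitated numerically. The sketch below measures how nearly straight the plot would be by the correlation between the ordered observations and the corresponding standard normal deviates, once for the discharges themselves and once for their logarithms. It rests on assumptions not taken from the thesis: the plotting position m/(n+1) and the use of a correlation coefficient in place of judgment by eye. The function name `straightness` is hypothetical; the data are those of Table 1.

```python
import math
from statistics import NormalDist

# Annual maximum discharges (cusecs), Table 1, in order of year 1911 to 1957.
PEAKS = [
    51442, 74100, 32600, 61740, 185560, 58800, 65597, 35347, 19885, 57220,
    24888, 25760, 84100, 27500, 75800, 58700, 40400, 61200, 38100, 23700,
    39200, 66000, 34400, 28100, 46300, 40400, 31500, 40000, 30200, 35700,
    26720, 42250, 44020, 121970, 20940, 44730, 28600, 65440, 32680, 50330,
    39020, 109700, 44900, 106600, 30380, 25460, 21820,
]


def straightness(values):
    """Correlation between ordered values and their normal-paper positions."""
    ordered = sorted(values)
    n = len(ordered)
    # Plotting position m/(n+1) is an assumption made for this sketch.
    z = [NormalDist().inv_cdf(m / (n + 1)) for m in range(1, n + 1)]
    mean_z = sum(z) / n
    mean_v = sum(ordered) / n
    sxy = sum((a - mean_z) * (b - mean_v) for a, b in zip(z, ordered))
    sxx = sum((a - mean_z) ** 2 for a in z)
    syy = sum((b - mean_v) ** 2 for b in ordered)
    return sxy / math.sqrt(sxx * syy)


r_raw = straightness(PEAKS)                            # normal paper
r_log = straightness([math.log10(q) for q in PEAKS])   # log probability paper
print(round(r_raw, 3), round(r_log, 3))
```

A value nearer to 1 corresponds to a straighter plot. For a markedly skewed record such as this one, the logarithms give the straighter line.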

8.  Other  Curves 

Other standard statistical curves which have been used on discharge
data are Pearson III (Ref. 5), Gamma and Poisson. The Poisson curve in particular
could be used in the future to analyse small probability discharges when sufficient
data are available. The United States Geological Survey is using a non-standard
distribution (invented by Gumbel, Ref. 3) with considerable success on many rivers.
The mathematical derivation of this curve is not understandable for most engineers
and any mathematical manipulations of it are extremely cumbersome. Other than
the fact that the curve is based on theory of extreme values the curve has limited
merit since it lacks explicable physical validity.

9.  An Outstanding Practical Utility of the Log Normal Curve

If the rating curve of a gauging site is plotted as gauge from mean bed vs.
discharge it will give a straight line plot on double log paper. This can be explained
by Manning's equation or regime theory (Ref. 1) for conditions of non-spill and the
slope of the line will be approximately 5/3 or 7/4. For spilling conditions a slight
adjustment of the slope will give good approximations. Therefore the plot of
logarithms of gauge referred to mean bed elevation will differ from a plot of
corresponding logarithms of discharge only by a change of scale. With rivers which follow
the log normal distribution we can therefore plot logarithms of gauge and get a
straight line fit on log probability paper. Thus great masses of gauge records
presently available and considered worthless without corresponding discharges can be
used if the river follows the log normal distribution and river regime has not radically
changed. One rating curve will convert the gauge records to discharge values.
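The "change of scale" argument can be written out briefly. The symbols below are introduced for this sketch only and are assumptions: $d$ is gauge height above mean bed, $C$ and $m$ are the coefficient and exponent of a power-law rating curve (with $m$ about 5/3 or 7/4 as stated above), and $\mu_Q$, $\sigma_Q$ are the mean and standard deviation of $\log Q$.

```latex
\begin{align*}
Q &= C\, d^{\,m}
  &&\text{(straight line rating curve on double log paper)} \\
\log Q &= \log C + m \log d \\
\log d &= \frac{\log Q - \log C}{m}
\end{align*}
% If log Q is normally distributed with mean \mu_Q and standard deviation \sigma_Q,
% then log d, being a linear function of log Q, is also normally distributed:
\begin{align*}
\mu_d = \frac{\mu_Q - \log C}{m}, \qquad \sigma_d = \frac{\sigma_Q}{m}
\end{align*}
```

Since a linear function of a normal variate is itself normal, the gauge heights referred to mean bed plot as a straight line on log probability paper whenever the discharges do, with only the position and slope of the line altered.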




10.  Reduction  of  Gauge  Data  to  Suit  Log  Prob.  Test 

The  following  procedure  will  calculate  mean  bed  elevation  without  field 
surveys  and  permit  gauge  data  to  be  processed  as  in  paragraph  9. 

Let the discharges at the highest and lowest gauges $G_1$ and $G_2$ which
correspond to reliable discharge observations be $Q_1$ and $Q_2$. Assuming Manning's
equation to be reasonably correct and taking gauge reading at mean bed elevation
as $G_0$,

\[
\left( \frac{Q_2}{Q_1} \right)^{3/5} = \frac{G_2 - G_0}{G_1 - G_0}
\]

Solve this equation to find a first approximation to $G_0$. Plot the rating curve for the
discharge data which include $Q_1$ and $Q_2$ against gauge minus $G_0$ on double log paper.
If a straight line fit results, use the $G_0$ calculated but, if not, find by trial and error
the fixed amount which, when deducted from gauge minus $G_0$, will give a straight
line fit. This will give a new $G_0$ which can be used to reduce gauge data to values
relative to mean bed level. If frequency analysis is needed where rating curves have
not been calculated, field methods discussed in Reference 5 may be used to estimate
mean bed level. Frequency analysis of gauge in such cases can be converted to
discharge analysis when a rating curve is established.
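The first approximation to the gauge reading at mean bed elevation follows from rearranging the equation of this paragraph. Writing r for the discharge ratio raised to the power 3/5, the equation r (G1 - G0) = G2 - G0 gives G0 = (G2 - r G1) / (1 - r). The sketch below codes that rearrangement; the function name `mean_bed_gauge`, the `exponent` argument and the numbers in the usage example are hypothetical and serve only to show that the formula recovers a known value.

```python
def mean_bed_gauge(g1, q1, g2, q2, exponent=5.0 / 3.0):
    """First approximation to the gauge reading G0 at mean bed elevation.

    Solves (q2 / q1) ** (1 / exponent) = (g2 - g0) / (g1 - g0) for g0,
    where (g1, q1) and (g2, q2) are the highest and lowest reliable
    gauge and discharge observations.
    """
    r = (q2 / q1) ** (1.0 / exponent)
    if r == 1.0:
        raise ValueError("the two discharges must differ")
    return (g2 - r * g1) / (1.0 - r)


# Synthetic check with made-up values: a rating curve Q = k * (G - G0) ** (5/3).
k, true_g0 = 100.0, 2.5
g_high, g_low = 12.0, 4.0
q_high = k * (g_high - true_g0) ** (5.0 / 3.0)
q_low = k * (g_low - true_g0) ** (5.0 / 3.0)
print(mean_bed_gauge(g_high, q_high, g_low, q_low))
```

The trial and error adjustment described in the text is still needed where the real rating curve departs from the assumed exponent.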

11.  Validity  of  Log  Prob.  Fitting  Curve  for  Annual  Peak  Floods 

The discussion of paragraph 9 indicates that a log probability treatment
of annual flood data would be extremely advantageous. The reasonableness of its
use has been demonstrated by other workers (Ref. 6, 10) with extensive data
testing. However, other curves such as the Gumbel curve have also been used
successfully (Ref. 2) and the acceptance of log normal must be justified statistically.
A standard statistical test for goodness of fit of a theoretical curve is the
Chi-square test (Ref. 4, 5). The author has applied this test to the annual peak data
of the North Saskatchewan River at Edmonton during the years 1911 to 1957. The
results of the test indicate good fit for the logarithmic normal curve despite the
visibly obvious deviation of points from straight line fit on log prob. paper (see
figure 2). The actual calculation and discussion are contained in appendix 3. The
particular data were used to demonstrate agreement of both the statistical analysis
and common sense practical test of paragraph 4.

[Fig. 4: Log Prob. Plot of Annual Maximum Discharges for the North Saskatchewan
River at Edmonton, fitted with 68% Confidence Bands, Years 1911 - 1957.
Probability axis: 0.01 to 99.99 per cent.]
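A Chi-square test of a log normal fit can be sketched as follows. This is not the tabular calculation of appendix 3, whose class limits are not reproduced here; it is a minimal version under assumptions chosen for the sketch: the curve is fitted by the mean and standard deviation of the logarithms, and the classes are chosen so that each holds the same expected number of observations. The function name `chi_square_lognormal` and the choice of six classes are hypothetical. The data are those of Table 1.

```python
import math
from statistics import NormalDist, mean, stdev

# Annual maximum discharges (cusecs), Table 1, in order of year 1911 to 1957.
PEAKS = [
    51442, 74100, 32600, 61740, 185560, 58800, 65597, 35347, 19885, 57220,
    24888, 25760, 84100, 27500, 75800, 58700, 40400, 61200, 38100, 23700,
    39200, 66000, 34400, 28100, 46300, 40400, 31500, 40000, 30200, 35700,
    26720, 42250, 44020, 121970, 20940, 44730, 28600, 65440, 32680, 50330,
    39020, 109700, 44900, 106600, 30380, 25460, 21820,
]


def chi_square_lognormal(peaks, n_classes=6):
    """Chi-square statistic for a log normal curve fitted by moments of logs."""
    logs = [math.log10(q) for q in peaks]
    fitted = NormalDist(mean(logs), stdev(logs))
    # Interior class limits giving equal fitted probability to every class.
    edges = [fitted.inv_cdf(k / n_classes) for k in range(1, n_classes)]
    observed = [0] * n_classes
    for v in logs:
        observed[sum(v > e for e in edges)] += 1
    expected = len(logs) / n_classes
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    # Degrees of freedom: classes, less one, less two fitted parameters.
    dof = n_classes - 1 - 2
    return chi2, dof, observed


chi2, dof, observed = chi_square_lognormal(PEAKS)
print(observed, round(chi2, 2), dof)
```

The statistic is compared with the tabulated Chi-square value for the stated degrees of freedom (7.81 at the 5 per cent level for three degrees of freedom); a smaller computed value gives no ground for rejecting the fitted curve.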

12.  The  Use  of  Confidence  Bands 

Accepting the logarithmic normal curve, the next statistical analysis is
to fit the curve with confidence bands. These bands are a formal graphical
representation of the range of data which can be expected with a given probability.

In fact, any mathematical representation of data, such as logarithmic probability,
cannot be used to predict exact magnitude of future events. The 95% confidence
band represents the interval in which we may assume, with 95% confidence, that
the data will be contained. The data for the North Saskatchewan River at Edmonton
are fitted with a 68% confidence band in figure 4. The variations in actual data
are not surprising when examined with reference to the confidence bands. The
construction of the confidence bands and references to the statistical methods
employed are given in appendix 5.
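One common way to obtain such a band is from the standard error of a percentile of a normal sample. The sketch below uses the large-sample approximation in which the percentile estimated as mean plus z standard deviations has a standard error of s * sqrt((1 + z^2 / 2) / n), and takes one standard error either side as the 68% band. Whether appendix 5 uses exactly this expression is an assumption; the function name `band_68` and the numerical arguments in the usage lines are illustrative and are not results from the thesis.

```python
import math
from statistics import NormalDist


def band_68(mean_log, sd_log, n, exceedance_prob):
    """Lower limit, fitted value and upper limit of a 68% band, in cusecs.

    mean_log and sd_log are the mean and standard deviation of the base 10
    logarithms of the sample; n is the number of years of record.
    """
    z = NormalDist().inv_cdf(1.0 - exceedance_prob)
    centre = mean_log + z * sd_log
    # Approximate standard error of a percentile of a normal sample.
    se = sd_log * math.sqrt((1.0 + z * z / 2.0) / n)
    return 10 ** (centre - se), 10 ** centre, 10 ** (centre + se)


# Illustrative parameters only (assumed, not computed in the thesis text).
for p in (0.5, 0.1, 0.02, 0.01):
    low, fitted, high = band_68(4.6, 0.2, 47, p)
    print(p, round(low), round(fitted), round(high))
```

The band is narrowest near the median and widens toward the rare events, which is the behaviour that makes extrapolated values so uncertain.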

13.  Method  of  Fitting  Log  Prob.  Curve  to  Data 

Since the log probability curve is of great utility (paragraph 9) and is
acceptable both practically (paragraph 4) and statistically (paragraph 11), the data
will be fitted with this curve. The method of best mathematical fit is used rather
than a fit by eye to justify the mathematical derivation of confidence bands.

Appendix 4 gives a mathematical analysis showing how the mean and standard
deviation of the logarithmic plot can be computed by simple formulae from the
mean and standard deviation of the original data. This is followed by a manual
on how to fit the confidence bands on log prob. paper using the mean and standard
deviation of the log normal curve.
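The conversion from the mean and standard deviation of the original data to those of the logarithms can be sketched with the standard identities for a log normal variate. In natural logarithms these are sigma^2 = ln(1 + s^2 / m^2) and mu = ln(m) - sigma^2 / 2, where m and s are the mean and standard deviation of the untransformed data. That the appendix uses these identities in precisely this form is an assumption, and the function name `lognormal_params_from_moments` is hypothetical.

```python
import math


def lognormal_params_from_moments(sample_mean, sample_sd):
    """Mean and standard deviation of ln(x) for a log normal variate x.

    Uses the identities  sigma^2 = ln(1 + (s/m)^2)  and
    mu = ln(m) - sigma^2 / 2,  with m and s the mean and standard
    deviation of x itself.
    """
    cv_squared = (sample_sd / sample_mean) ** 2
    sigma_ln = math.sqrt(math.log(1.0 + cv_squared))
    mu_ln = math.log(sample_mean) - 0.5 * sigma_ln ** 2
    return mu_ln, sigma_ln


# Round trip from known parameters, to show the identities are consistent.
mu, sigma = 10.0, 0.5
m = math.exp(mu + sigma ** 2 / 2.0)
s = m * math.sqrt(math.exp(sigma ** 2) - 1.0)
print(lognormal_params_from_moments(m, s))

# Divide both results by ln(10) to work in base 10 logarithms.
```

With real data the result holds only as closely as the sample follows the log normal distribution, so it is worth comparing against the mean and standard deviation computed directly from the logarithms.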

14.  Conclusions

In  terms  of  the  development  of  this  thesis,  the  author  recommends  the 
following  treatment  of  Alberta  hydrometric  data  in  estimating  water  resources: 

(1)  Frequency  curves  for  annual  maximum  and  annual  minimum  discharges 
should  be  plotted  on  log  prob.  paper  according  to  the  instructions  of 
Appendix  4.  Confidence  bands  at  a  68%  confidence  interval  should  be 
fitted  as  described  in  Appendix  5. 

(2)  Streams  of  interest  which  have  gauge  records  only  should  be  analysed 
by  converting  gauge  records  to  mean  depths  as  in  paragraph  10  and 
analysed  as  in  paragraph  14  (1)  above.  The  gauges  should  be  converted 
to  discharges  with  a  suitable  rating  curve. 

15.  Recommendations  for  Further  Studies 

Since the statistical curves are used only to fit independent variates,
the methods described in this thesis have not been applied as yet to duration volumetric
studies. Variates such as prolonged shortages or surpluses of flow are obviously
dependent and have been discarded on this basis. The author feels that data of this
nature could be analysed by using a normal curve fit and he is presently testing
this hypothesis. The proposal is based on the Central Limit Theorem of statistics
which states in exact terms that the distribution of means of samples taken from
a large population will be normal, providing sufficient samples are taken, although
the parent population may have a non-normal distribution. If the hypothesis can
be established, Alberta hydrometric data can be used in retention reservoir studies.
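The tendency described by the Central Limit Theorem can be seen in a small simulation. The sketch below draws from a skewed (log normal) parent population and compares its skewness with that of the means of samples of 30. It is an illustration only: the draws are independent, whereas the flow volumes in question are dependent, so it does not test the author's hypothesis. The parent parameters, the sample size, the seed and the function name `skewness` are all assumptions made for the sketch.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Coefficient of skewness (third standardized moment) of a sample."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(1959)  # fixed seed so the run is repeatable

# Skewed parent population: log normal with assumed parameters.
parent = [rng.lognormvariate(0.0, 0.5) for _ in range(20000)]

# Means of 2,000 independent samples of 30 values each.
sample_means = [
    mean(rng.lognormvariate(0.0, 0.5) for _ in range(30))
    for _ in range(2000)
]

print(round(skewness(parent), 2), round(skewness(sample_means), 2))
```

The means are far less skewed than the parent population, which is the property the proposal relies upon.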




APPENDIX  I 

CONSTRUCTION  OF  HISTOGRAMS  AND  METHOD  OF  VARIATE 
INTERVAL  MANIPULATION 

We will consider nature as having an infinite number of river discharges
which will be defined as the population. The method of predicting the behaviour of
the population is to select a random sample of finite size and examine it.
Unfortunately this sample in most cases is extremely limited in size and makes good
analysis very difficult. However, there is no point in waiting for sufficient time to
get a better sample since analyses are needed now.

Analysis  of  the  Sample 

The record of discharges for a given gauging site is used as the sample.
From this sample some particular significant group of values of the discharges is
chosen. It could be, for instance, the annual peak, mean or minimum discharge.

In this discussion the peak annual discharge values will be selected as
the variate for analysis. The variate will be expressed as

$X$ = maximum annual discharge in cubic ft/sec.

$X$ can have values $x_1, x_2, x_3, \ldots, x_i$ where $i$ is the order of
magnitude of the event.

Although the sample can only have values in the range $x_1 \le X \le x_r$,
it should be obvious that in the parent population it would have values in the range
$x_1 \le X \le x_n$, where $x_n$ represents the value of the largest discharge in an
infinitely long time.


If the frequency of occurrence of an event of magnitude $x_i$ is considered,
it is most easily analysed in histogram form. The histogram is plotted for peak
flows on the North Saskatchewan River at Edmonton using the data of tables 1-3.
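The grouping of the annual peaks into class intervals can be sketched as below, using the discharges of Table 1. The function name `class_counts` is hypothetical. The convention that a value falling exactly on a class limit is counted in the upper class is an assumption, though it agrees with the counts given in Table 3 (for instance, 40,000 cusecs is counted in the 40,000 - 45,000 class).

```python
# Annual maximum discharges (cusecs), Table 1, in order of year 1911 to 1957.
PEAKS = [
    51442, 74100, 32600, 61740, 185560, 58800, 65597, 35347, 19885, 57220,
    24888, 25760, 84100, 27500, 75800, 58700, 40400, 61200, 38100, 23700,
    39200, 66000, 34400, 28100, 46300, 40400, 31500, 40000, 30200, 35700,
    26720, 42250, 44020, 121970, 20940, 44730, 28600, 65440, 32680, 50330,
    39020, 109700, 44900, 106600, 30380, 25460, 21820,
]


def class_counts(values, width, start):
    """Number of observations in each class [start + k*width, start + (k+1)*width)."""
    n_classes = int((max(values) - start) // width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[int((v - start) // width)] += 1
    return counts


counts_5000 = class_counts(PEAKS, 5000, 15000)
for k, count in enumerate(counts_5000):
    lower = 15000 + 5000 * k
    print(f"{lower:>7,} - {lower + 5000:>7,}  {count}")
```

Calling `class_counts(PEAKS, 10000, 10000)` regroups the same record into 10,000 cusecs classes starting from an assumed lower limit of 10,000 cusecs; the limits actually used in Table 4 may differ.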


TABLE 1

MAXIMUM ANNUAL DISCHARGE NORTH SASKATCHEWAN RIVER AT EDMONTON 1911 - 1957

Year  Month      Q in cusecs     Year  Month      Q in cusecs

1911  July            51,442     1934  June            28,100
1912  July            74,100     1935  July            46,300
1913  August          32,600     1936  April           40,400
1914  June            61,740     1937  July            31,500
1915  July           185,560     1938  July            40,000
1916  June            58,800     1939  June            30,200
1917  May             65,597     1940  April           35,700
1918  June            35,347     1941  June            26,720
1919  June            19,885     1942  July            42,250
1920  May             57,220     1943  April           44,020
1921  May             24,888     1944  June           121,970
1922  August          25,760     1945  May             20,940
1923  June            84,100     1946  June            44,730
1924  July            27,500     1947  June            28,600
1925  August          75,800     1948  May             65,440
1926  September       58,700     1949  July            32,680
1927  June            40,400     1950  June            50,330
1928  July            61,200     1951  May             39,020
1929  June            38,100     1952  June           109,700
1930  July            23,700     1953  June            44,900
1931  July            39,200     1954  June           106,600
1932  June            66,000     1955  June            30,380
1933  June            34,400     1956  June            25,460
                                 1957  June            21,820


TABLE 2

MAXIMUM ANNUAL DISCHARGE NORTH SASKATCHEWAN RIVER
(in Cusecs) AT EDMONTON IN ORDER OF MAGNITUDE

 1.   19,885   June/19       19.   35,700   April/40      37.    61,740   June/14
 2.   20,940   May/45        20.   38,100   June/29       38.    65,440   May/48
 3.   21,820   June/57       21.   39,020   May/51        39.    65,597   May/17
 4.   23,700   July/30       22.   39,200   July/31       40.    66,000   June/32
 5.   24,888   May/21        23.   40,000   July/38       41.    74,100   July/12
 6.   25,460   June/56       24.   40,400   June/27       42.    75,800   Aug/25
 7.   25,760   Aug/22        25.   40,400   April/36      43.    84,100   June/23
 8.   26,720   June/41       26.   42,250   July/42       44.   106,600   June/54
 9.   27,500   July/24       27.   44,020   April/43      45.   109,700   June/52
10.   28,100   June/34       28.   44,730   June/46       46.   121,970   June/44
11.   28,600   June/47       29.   44,900   June/53       47.   185,560   July/15
12.   30,200   June/39       30.   46,300   July/35
13.   30,380   June/55       31.   50,330   June/50
14.   31,500   July/37       32.   51,442   July/11
15.   32,600   Aug/13        33.   57,220   May/20
16.   32,680   July/49       34.   58,700   Sept/26
17.   34,400   June/33       35.   58,800   June/16
18.   35,347   June/18       36.   61,200   July/28

$\sum x$ = 2,349,799 cusecs
$\bar{x}$ = 49,996 cusecs
median = 40,400 cusecs
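
The summary figures under Table 2 can be checked mechanically. The following is a minimal sketch, assuming the discharges are transcribed correctly from Table 2; the name `DISCHARGES` is introduced here for illustration only.

```python
from statistics import mean, median

# Annual maximum discharges (cusecs), North Saskatchewan River at Edmonton,
# 1911-1957, transcribed from Table 2 in ascending order.
DISCHARGES = [
    19885, 20940, 21820, 23700, 24888, 25460, 25760, 26720, 27500, 28100,
    28600, 30200, 30380, 31500, 32600, 32680, 34400, 35347, 35700, 38100,
    39020, 39200, 40000, 40400, 40400, 42250, 44020, 44730, 44900, 46300,
    50330, 51442, 57220, 58700, 58800, 61200, 61740, 65440, 65597, 66000,
    74100, 75800, 84100, 106600, 109700, 121970, 185560,
]

total = sum(DISCHARGES)      # sum of all annual maxima
x_bar = mean(DISCHARGES)     # arithmetic mean
x_med = median(DISCHARGES)   # middle (24th) value of the 47 observations

print(f"r      = {len(DISCHARGES)}")
print(f"sum    = {total:,} cusecs")
print(f"mean   = {x_bar:,.0f} cusecs")
print(f"median = {x_med:,} cusecs")
```

The mean lying well above the median is a first numerical sign of the right-hand skew discussed in the text.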




TABLE 3

FREQUENCY OF DISCHARGES PER 5,000 CUSECS INTERVAL

Class Interval          Number
in cusecs               Observed

15,000 - 20,000            1
20,000 - 25,000            4
25,000 - 30,000            6
30,000 - 35,000            6
35,000 - 40,000            5
40,000 - 45,000            7
45,000 - 50,000            1
50,000 - 55,000            2
55,000 - 60,000            3
60,000 - 65,000            2
65,000 - 70,000            3
70,000 - 75,000            1
75,000 - 80,000            1
80,000 - 85,000            1
85,000 - 90,000            0
90,000 - 95,000            0
above 100,000              4



TABLE 4

FREQUENCY OF DISCHARGES PER 10,000 CUSECS INTERVAL

Class in Cusecs          Number
                         Observed

      0 -  10,000           0
 10,000 -  20,000           1
 20,000 -  30,000          10
 30,000 -  40,000          11
 40,000 -  50,000           8
 50,000 -  60,000           5
 60,000 -  70,000           5
 70,000 -  80,000           2
 80,000 -  90,000           1
 90,000 - 100,000           0
100,000 - 110,000           2
110,000 - 120,000           0
120,000 - 130,000           1
180,000 - 190,000           1



It  appears  from  the  histogram  as  it  is  plotted  in  Figure  5  that  none  of 
the  more  common  statistical  curves  will  fit  the  given  frequency  distribution.  The 
distribution  is  very  skew  with  a  long  tail  to  the  right  and  practically  none  to  the 
left. 

Histogram  of  the  Logarithms  of  X 

The class interval used in the histogram of Figure 5 is 5,000 cusecs.
The logarithms of successive class limits were taken and a new histogram was plotted
in Figure 6. This distribution is not as skew as the original distribution.

Change  of  Class  Interval 

The  original  class  interval  of  5,000  cusecs  was  chosen  as  a  convenient 
interval  to  reduce  data  for  rapid  calculations.  This  interval  may  give  irregularity 
when  plotting  the  frequency  distribution.  Another  class  interval  might  give  another 
shape  and  this  possibility  was  investigated. 

Table 4 shows the frequency distribution with a class interval of 10,000
cusecs. The histogram of Figure 7 shows a distribution which can be fitted more
easily with a smoothing curve.
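
The effect of changing the class interval can be reproduced directly from the ranked data. This is a sketch only: the helper `class_counts` is a hypothetical name, and it assumes that a discharge falling exactly on a class limit (for example 40,000) is counted in the upper class, which is the convention that reproduces Tables 3 and 4.

```python
from collections import Counter

# Annual maximum discharges (cusecs), transcribed from Table 2.
DISCHARGES = [
    19885, 20940, 21820, 23700, 24888, 25460, 25760, 26720, 27500, 28100,
    28600, 30200, 30380, 31500, 32600, 32680, 34400, 35347, 35700, 38100,
    39020, 39200, 40000, 40400, 40400, 42250, 44020, 44730, 44900, 46300,
    50330, 51442, 57220, 58700, 58800, 61200, 61740, 65440, 65597, 66000,
    74100, 75800, 84100, 106600, 109700, 121970, 185560,
]


def class_counts(values, width):
    """Count values per class [k*width, (k+1)*width), keyed by the lower limit."""
    return Counter((v // width) * width for v in values)


counts_5k = class_counts(DISCHARGES, 5_000)    # grouping of Table 3
counts_10k = class_counts(DISCHARGES, 10_000)  # grouping of Table 4

for lower in sorted(counts_10k):
    print(f"{lower:>7,} - {lower + 10_000:>7,}: {counts_10k[lower]}")
```

Empty classes are simply absent from the printed list; looking them up in the `Counter` returns zero.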

The logarithmic class intervals were plotted as a histogram and can be
seen in Figure 8. The smoothing curve was drawn by eye. This distribution looks
fairly symmetrical and indicates that a log normal distribution curve would fit the
actual distribution.



APPENDIX  2 


DERIVATION  OF  MEAN  AND  STANDARD  DEVIATION  OF  LOG  NORMAL 
DISTRIBUTION  FROM  MEAN  AND  STANDARD  DEVIATION  OF  THE  SAMPLE 

Log  Normal  Distribution 

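We have the original variate X, which we will consider as having a log
normal frequency distribution.

Let $Y = \log_e X$.

Now we can say Y has a normal frequency distribution. Let

g(y) = probability density for Y

g(y) dy = probability that $y \le Y \le y + dy$

$$ g(y) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(y-\mu)^2}{2\sigma^2}} $$

where $\mu$ = mean of Y distribution

$\sigma$ = standard deviation of Y distribution

$$ y = \log_e x, \qquad x = e^{y}, \qquad dy = \frac{dx}{x} $$

Note that x can only be positive.

Let $\lambda'_j$ be the jth moment about the origin for X.

$$ \lambda'_j = \int_{-\infty}^{\infty} e^{jy}\, g(y)\, dy
   = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{\,jy - \frac{(y-\mu)^2}{2\sigma^2}}\, dy $$

which can be integrated by a change of variable giving

$$ \lambda'_j = e^{j\mu}\, e^{\frac{j^2\sigma^2}{2}} $$

Mean of X $= \bar{x} = \lambda'_1$

$$ \bar{x} = e^{\mu + \frac{\sigma^2}{2}} \qquad \text{or} \qquad \log_e \bar{x} = \mu + \frac{\sigma^2}{2} $$

Variance of X $= (\text{s.d.})^2$

$$ (\text{s.d.})^2 = \lambda'_2 - (\lambda'_1)^2
   = e^{2\mu + 2\sigma^2} - e^{2\mu + \sigma^2}
   = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)
   = (\lambda'_1)^2 \left(e^{\sigma^2} - 1\right) $$

$$ (\text{s.d.})^2 = (\bar{x})^2 \left(e^{\sigma^2} - 1\right)
   \qquad \text{or} \qquad e^{\sigma^2} - 1 = \frac{(\text{s.d.})^2}{(\bar{x})^2} $$

$$ e^{\sigma^2} = \left(\frac{\text{s.d.}}{\bar{x}}\right)^2 + 1 $$

$$ \sigma^2 = \log_e\left[\left(\frac{\text{s.d.}}{\bar{x}}\right)^2 + 1\right] $$

Restating equations (1) and (2) we have

$$ \mu + \frac{\sigma^2}{2} = \log_e \bar{x} \qquad (1) $$

$$ \sigma^2 = \log_e\left[\left(\frac{\text{s.d.}}{\bar{x}}\right)^2 + 1\right] \qquad (2) $$
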

Since we can calculate the mean and standard deviation of the observed
distribution, i.e. $\bar{x}$ and s.d., we can calculate $\mu$ and $\sigma$ for the log normal
distribution.
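
Equations (1) and (2) translate into a few lines of code. The sketch below is illustrative only; the function names are hypothetical, and the second function is included merely as an inverse check that the two equations are mutually consistent.

```python
import math


def lognormal_parameters(mean_x, sd_x):
    """Return (mu, sigma) of Y = ln X from the mean and s.d. of X.

    Equation (2): sigma^2 = ln[(s.d./mean)^2 + 1]
    Equation (1): mu = ln(mean) - sigma^2 / 2
    """
    sigma_sq = math.log((sd_x / mean_x) ** 2 + 1.0)
    mu = math.log(mean_x) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


def lognormal_moments(mu, sigma):
    """Inverse check: mean and s.d. of X recovered from mu and sigma."""
    mean_x = math.exp(mu + sigma ** 2 / 2.0)
    sd_x = mean_x * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean_x, sd_x


# Sample statistics quoted in the text for the North Saskatchewan River.
mu, sigma = lognormal_parameters(49_996.0, 30_670.0)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
print("round trip:", lognormal_moments(mu, sigma))
```

Floating-point evaluation may differ in the third decimal place from values obtained by hand with logarithm tables.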

Calculation  of  Mean  and  Standard  Deviation  of  Observed  Distribution 


The mean of the actual distribution was calculated by

$$ \bar{x} = \frac{1}{r} \sum_{i=1}^{r} x_i = 49{,}996 \text{ cusecs.} $$

The standard deviation of the actual distribution

$$ \text{s.d.} = \left[ \frac{1}{r} \sum_{i=1}^{r} (x_i - \bar{x})^2 \right]^{1/2} $$

This calculation is carried out in tabular form in Table 5.



TABLE 5

TABULAR CALCULATION OF MEAN & STANDARD DEVIATION OF SAMPLE
USING DATA OF NORTH SASKATCHEWAN RIVER AT EDMONTON

The last column of each group is $(x_i - \bar{x})^2 \times 10^{-6}$.

 i     x_i      |x_i - x̄|   (x_i - x̄)²       i      x_i      |x_i - x̄|   (x_i - x̄)²

 1    19,885     30,111       906.7        25     40,400       9,596         92.1
 2    20,940     29,056       844.3        26     42,250       7,746         60.0
 3    21,820     28,176       793.9        27     44,020       5,976         35.7
 4    23,700     26,296       691.5        28     44,730       5,266         27.7
 5    24,888     25,108       630.4        29     44,900       5,096         26.0
 6    25,460     24,536       602.0        30     46,300       3,696         13.7
 7    25,760     24,236       587.4        31     50,330         334          0.1
 8    26,720     23,276       541.8        32     51,442       1,446          2.1
 9    27,500     22,496       506.1        33     57,220       7,224         52.2
10    28,100     21,896       479.4        34     58,700       8,704         75.8
11    28,600     21,396       457.8        35     58,800       8,804         77.5
12    30,200     19,796       391.9        36     61,200      11,204        125.5
13    30,380     19,616       384.8        37     61,740      11,744        137.9
14    31,500     18,496       342.1        38     65,440      15,444        238.5
15    32,600     17,396       302.6        39     65,597      15,601        243.4
16    32,680     17,316       299.8        40     66,000      16,004        256.1
17    34,400     15,596       243.2        41     74,100      24,104        581.0
18    35,347     14,649       214.6        42     75,800      25,804        665.8
19    35,700     14,296       204.4        43     84,100      34,104      1,163.1
20    38,100     11,896       141.5        44    106,600      56,604      3,204.0
21    39,020     10,976       120.5        45    109,700      59,704      3,564.5
22    39,200     10,796       116.6        46    121,970      71,974      5,167.3
23    40,000      9,996        99.9        47    185,560     135,564     18,377.6
24    40,400      9,596        92.1

Totals:   $\sum x_i$ = 2,349,799     $\sum |x_i - \bar{x}|$ = 1,008,747     $\sum (x_i - \bar{x})^2 \times 10^{-6}$ = 44,182.9

$$ \bar{x} = \frac{1}{r} \sum x_i = 49{,}996 $$

$$ \frac{1}{r} \sum (x_i - \bar{x})^2 = 940.06 \times 10^{6} $$

$$ \left[ \frac{1}{r} \sum (x_i - \bar{x})^2 \right]^{1/2} = 30{,}670 = \text{s.d.} $$
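
The tabular calculation can be repeated in code as a check on the arithmetic. This is a sketch under the assumption, taken from the totals of Table 5, that the sum of squared deviations is divided by r rather than r - 1; the variable names are illustrative. A recomputation in floating point may differ by a few cusecs from the hand-calculated standard deviation.

```python
import math

# Annual maximum discharges (cusecs), transcribed from Table 2.
DISCHARGES = [
    19885, 20940, 21820, 23700, 24888, 25460, 25760, 26720, 27500, 28100,
    28600, 30200, 30380, 31500, 32600, 32680, 34400, 35347, 35700, 38100,
    39020, 39200, 40000, 40400, 40400, 42250, 44020, 44730, 44900, 46300,
    50330, 51442, 57220, 58700, 58800, 61200, 61740, 65440, 65597, 66000,
    74100, 75800, 84100, 106600, 109700, 121970, 185560,
]

r = len(DISCHARGES)
x_bar = sum(DISCHARGES) / r

# One entry per row of the table: the squared deviation from the mean.
squared_devs = [(x - x_bar) ** 2 for x in DISCHARGES]

variance = sum(squared_devs) / r   # divisor r, as in the table totals
s_d = math.sqrt(variance)

print(f"mean     = {x_bar:,.0f} cusecs")
print(f"variance = {variance / 1e6:,.2f} x 10^6")
print(f"s.d.     = {s_d:,.0f} cusecs")
```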


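Thus for the actual distribution
$\bar{x}$ = 49,996 cusecs
s.d. = 30,670 cusecs

Hence

$$ \sigma^2 = \log_e\left[\left(\frac{\text{s.d.}}{\bar{x}}\right)^2 + 1\right]
   = \log_e\left[\left(\frac{30{,}670}{49{,}996}\right)^2 + 1\right]
   = \log_e\left(1 + (0.613)^2\right) = \log_e 1.377 = 0.319 $$

$$ \sigma = (0.319)^{1/2} = 0.564 = \log_e 1.744 $$

$$ \mu = \log_e \bar{x} - \frac{\sigma^2}{2}
   = \log_e 49{,}996 - 0.158
   = 10.82088 - 0.158
   = 10.66288
   = \log_e 42{,}320 $$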


APPENDIX  3 


CHI  SQUARE  TEST  FOR  LOG  NORMAL  FIT  OF  ANNUAL  MAXIMUM 
DISCHARGE  NORTH  SASKATCHEWAN  RIVER 


Chi-Square  Test 

One test for the goodness of fit of a theoretical curve is the Chi-square
test. The grouping of Table 4 was rearranged so that small frequency groups at
the left end of the distribution were lumped together. This was done to facilitate
calculation.


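Probability that $x \le X \le x + dx$ = f(x) dx

where $y = \log_e x$

$$ dy = \frac{dx}{x} $$

and probability that $y \le Y \le y + dy$ = g(y) dy

$$ g(y)\, dy = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}\, dy $$

Therefore

$$ f(x)\, dx = g(y)\, \frac{dx}{x} $$

and frequency

$$ r\, f(x)\, dx = \frac{r}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}\, \frac{dx}{x} $$

where r = number of observations.

Therefore the calculated frequency of a class is

$$ f_c = \left[ \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}} \right] \times r\, \frac{dx}{x} $$

The data were put into ten groups and can be seen in Table 6 under
serial numbers 1-10 inclusive. For the Chi-square test there are therefore
10 - 3 = 7 degrees of freedom since the values of total frequency, mean, and
standard deviation were fixed.

$$ \chi^2 = \sum \frac{(f_o - f_c)^2}{f_c} = 2.824 $$
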


TABLE 6

TABULAR CALCULATION OF CHI SQUARE TEST FOR LOG NORMAL FIT OF ANNUAL MAXIMUM
DISCHARGE OF NORTH SASKATCHEWAN RIVER AT EDMONTON

Column headings:
Class = class interval dx, in thousands of cusecs
x     = class mark in cusecs
y     = $\log_e x$
(a)   = $|y - \mu|$
(b)   = $(y - \mu)^2 / 2\sigma^2$
(c)   = $e^{-(y-\mu)^2 / 2\sigma^2}$
(d)   = (c) $\times\ 1/(\sigma\sqrt{2\pi})$
(e)   = $r\, dx / x$
f_c   = calculated frequency = (d) x (e)
f_o   = observed frequency
(f)   = $|f_o - f_c|$
(g)   = $(f_o - f_c)^2$
(h)   = $(f_o - f_c)^2 / f_c$

Serial  Class         x        log10 x   y         (a)     (b)     (c)     (d)     (e)     f_c     f_o   (f)    (g)    (h)

  1     0 to 20       10,000   4.0000     9.2103   1.453   3.31    0.037   0.026   94.0    2.47     1    1.47   2.17   0.880
  2     20 to 30      25,000   4.3979    10.1265   0.536   0.450   0.638   0.454   18.80   8.54    10    1.46   2.14   0.254
  3     30 to 40      35,000   4.5441    10.4632   0.195   0.061   0.942   0.670   13.43   9.00    11    2.00   4.00   0.444
  4     40 to 50      45,000   4.6532    10.7144   0.052   0.004   1.00    0.710   10.45   7.44     8    0.56   0.31   0.042
  5     50 to 60      55,000   4.7404    10.9152   0.252   0.101   0.913   0.650    8.55   5.56     5    0.51   0.26   0.047
  6     60 to 70      65,000   4.8129    11.0821   0.419   0.279   0.758   0.539    7.24   3.90     5    1.10   1.22   0.313
  7     70 to 80      75,000   4.8751    11.2253   0.562   0.494   0.610   0.434    6.28   2.75     2    0.75   0.51   0.188
  8     80 to 90      85,000   4.9294    11.3503   0.687   0.739   0.477   0.339    5.54   1.88     1    0.87   0.76   0.406
  9     90 to 110    100,000   5.0000    11.5129   0.850   1.13    0.323   0.230    9.40   2.16     2    0.16   0.03   0.052
 10     110 to 190   150,000   5.1761    11.9184   1.256   2.46    0.085   0.061   25.1    1.52     2    0.54   0.29   0.198

Totals                                                                                     45.22    47                  2.824

$2\sigma^2$ = 0.638
$\mu$ = 10.66288
$1/(\sigma\sqrt{2\pi})$ = 0.710



Using Table III, page 414 of "Mathematics of Statistics", Part 2, by
Kenney and Keeping (Van Nostrand 1951), the probability of a deviation greater
than $\chi^2$ = 2.824 for 7 degrees of freedom = 0.90.

This value (0.90) indicates that the log normal curve is a good fit for
the data, since it is well above the limiting probability of 0.05 quoted by the
above-mentioned reference.
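
The test can also be evaluated without statistical tables. The sketch below recomputes Chi-square from the f_c and f_o columns of Table 6 and evaluates the tail probability from a closed-form series that holds for an odd number of degrees of freedom. The function names are hypothetical. Because the tabulated intermediate values were rounded by hand, the recomputed statistic may differ slightly from 2.824.

```python
import math

# Calculated (f_c) and observed (f_o) class frequencies from Table 6.
F_CALC = [2.47, 8.54, 9.00, 7.44, 5.56, 3.90, 2.75, 1.88, 2.16, 1.52]
F_OBS = [1, 10, 11, 8, 5, 5, 2, 1, 2, 2]


def chi_square(observed, calculated):
    """Sum of (f_o - f_c)^2 / f_c over all classes."""
    return sum((fo - fc) ** 2 / fc for fo, fc in zip(observed, calculated))


def chi_square_tail_odd(x, dof):
    """P(chi-square > x) for an odd number of degrees of freedom.

    Uses erfc(sqrt(x/2)) plus a finite sum of (dof - 1) / 2 terms.
    """
    if dof < 1 or dof % 2 != 1:
        raise ValueError("this closed form needs odd degrees of freedom")
    term_sum = 0.0
    denom = 1.0
    for k in range(1, (dof - 1) // 2 + 1):
        denom *= 2 * k - 1                  # 1, 1*3, 1*3*5, ...
        term_sum += x ** (k - 0.5) / denom
    tail = math.erfc(math.sqrt(x / 2.0))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * term_sum


dof = len(F_OBS) - 3                        # total, mean and s.d. were fixed
chi2 = chi_square(F_OBS, F_CALC)
p_recomputed = chi_square_tail_odd(chi2, dof)
p_tabulated = chi_square_tail_odd(2.824, dof)

print(f"chi-square (recomputed) = {chi2:.3f}, P = {p_recomputed:.2f}")
print(f"chi-square (tabulated)  = 2.824, P = {p_tabulated:.2f}")
```

Either value of Chi-square leads to a probability far above the 0.05 limit, so the conclusion does not hinge on the rounding.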


APPENDIX  4 


INSTRUCTION  FOR  FITTING  LOG  NORMAL  LINE  TO  DATA 

(Illustrated  by  North  Saskatchewan  River  Data) 

Plotting  Data 

The data should be processed by a systematic method which will minimize
errors and facilitate checking. The discharges are listed in order of magnitude and
the percentage of time equalled or exceeded is calculated for each discharge. If
there are r observed discharges the percentage of time equalled or exceeded is
calculated by the formula

$$ \%\ \text{of time equalled or exceeded} = \frac{m}{r + 1} \times 100\% $$

where m = rank of event, counted from the largest discharge.
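
The plotting positions can be generated as follows. This is a sketch that assumes the m / (r + 1) form implied by the values of Table 7, with rank m = 1 given to the largest discharge; the function name is hypothetical, and tied discharges are ranked separately rather than being given a common percentage.

```python
# Annual maximum discharges (cusecs), transcribed from Table 2.
DISCHARGES = [
    19885, 20940, 21820, 23700, 24888, 25460, 25760, 26720, 27500, 28100,
    28600, 30200, 30380, 31500, 32600, 32680, 34400, 35347, 35700, 38100,
    39020, 39200, 40000, 40400, 40400, 42250, 44020, 44730, 44900, 46300,
    50330, 51442, 57220, 58700, 58800, 61200, 61740, 65440, 65597, 66000,
    74100, 75800, 84100, 106600, 109700, 121970, 185560,
]


def percent_equalled_or_exceeded(discharges):
    """Return (discharge, percent) pairs, largest discharge first.

    The largest discharge has rank m = 1; percent = 100 * m / (r + 1).
    """
    r = len(discharges)
    ranked = sorted(discharges, reverse=True)
    return [(q, 100.0 * m / (r + 1)) for m, q in enumerate(ranked, start=1)]


positions = percent_equalled_or_exceeded(DISCHARGES)
for q, pct in positions[:3] + positions[-3:]:
    print(f"{q:>8,} cusecs  {pct:5.1f} %")
```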

The data for the North Saskatchewan River at Edmonton are shown in
Table 7. The points are plotted on log probability graph paper as shown in Figure 9.

Calculation of Best Fitting Line

The points of Figure 9 could be fitted by an infinite number of lines if the
best fit was judged by eye. However, it is possible to calculate the best mathematical
fit using the formulae of Appendix 2.

Step 1

Calculate the mean and standard deviation of the sample

$$ \text{mean} = \bar{x} = \frac{1}{r} \sum_{i=1}^{r} x_i $$

$$ \text{standard deviation} = \text{s.d.} = \left[ \frac{1}{r} \sum_{i=1}^{r} (x_i - \bar{x})^2 \right]^{1/2} $$

These calculations should be done in tabular form as in Table 5. Note
that the data of the North Saskatchewan River at Edmonton are completely processed
as an example.


TABLE 7

FREQUENCY ANALYSIS - NORTH SASKATCHEWAN RIVER AT EDMONTON

Maximum Flows - Percentage of Time Equalled or Exceeded

        Discharge    % of Time Equalled            Discharge    % of Time Equalled
        in cfs.      or Exceeded                   in cfs.      or Exceeded

 (1)     19,885          97.8              (24)     40,400          50.0
 (2)     20,940          95.9              (25)     40,400          50.0
 (3)     21,820          93.7              (26)     42,250          45.9
 (4)     23,700          91.7              (27)     44,020          43.8
 (5)     24,888          89.6              (28)     44,730          41.6
 (6)     25,460          87.5              (29)     44,900          39.6
 (7)     25,760          85.5              (30)     46,300          37.5
 (8)     26,720          83.4              (31)     50,330          35.4
 (9)     27,500          81.4              (32)     51,442          33.3
(10)     28,100          79.2              (33)     57,220          31.3
(11)     28,600          77.1              (34)     58,700          29.2
(12)     30,200          75.0              (35)     58,800          27.1
(13)     30,380          72.9              (36)     61,200          25.0
(14)     31,500          70.8              (37)     61,740          22.9
(15)     32,600          68.8              (38)     65,440          20.8
(16)     32,680          66.7              (39)     65,597          18.7
(17)     34,400          64.6              (40)     66,000          16.7
(18)     35,347          62.5              (41)     74,100          14.6
(19)     35,700          60.4              (42)     75,800          12.5
(20)     38,100          58.4              (43)     84,100          10.4
(21)     39,020          56.3              (44)    106,600           8.3
(22)     39,200          54.2              (45)    109,700           6.3
(23)     40,000          52.2              (46)    121,970           4.2
                                           (47)    185,560           2.1

[Figure 9: annual maximum discharges of the North Saskatchewan River at Edmonton
plotted on log probability paper against percentage of time equalled or exceeded.]


Step 2

Calculate $\sigma$ with the formula

$$ \sigma^2 = \log_e\left[\left(\frac{\text{s.d.}}{\bar{x}}\right)^2 + 1\right] $$

Note here that, as long as the standard deviation does not exceed the mean,
$\sigma^2$ will lie in the range $\log_e 1 \le \sigma^2 \le \log_e 2$,
i.e. $0 \le \sigma \le 0.831$.

Step 3

Calculate $\mu$ with the formula

$$ \mu = \log_e \bar{x} - \frac{\sigma^2}{2} $$

Note that steps 2 and 3 have been carried out in Appendix 2.

Step 4

Change $\mu$ to a discharge by taking the antilog to base e.

For example, for the North Saskatchewan River
$\mu$ = 10.66288
= $\log_e$ 42,320

Plot the discharge as the ordinate at the 50% time abscissa. See Figure 10.

Step 5

Plot the $\sigma$ interval on the graph.

Since the scale of discharges is logarithmic it will be easier to measure the
interval in inches than to work out the discharges at the ends of the interval.

Table 8 gives the interval for values of $\sigma$ (as calculated in step 2) from 0.30
to 0.80. This interval is to be laid off on log probability paper at the 16% and
84% abscissae as shown on Figure 10. The line drawn through points a, b and c
as shown will be the best fit. For the North Saskatchewan River at Edmonton
$\sigma$ = 0.564 (calculated in Appendix 2).
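
Where the graph paper of this thesis is not available, the three points of the best fit line can instead be computed as discharges. The sketch below assumes that the 16% and 84% abscissae correspond to one standard deviation of log_e x above and below the 50% point, which is the usual property of the normal curve; the function name is hypothetical. Direct evaluation of the exponentials may differ somewhat from antilogarithms obtained by hand.

```python
import math

MU = 10.66288   # mean of log_e(discharge), from Appendix 2
SIGMA = 0.564   # standard deviation of log_e(discharge), from Appendix 2


def line_points(mu, sigma):
    """Discharges at 84%, 50% and 16% of time equalled or exceeded."""
    return {
        84: math.exp(mu - sigma),   # one s.d. below the median
        50: math.exp(mu),           # median of the log normal distribution
        16: math.exp(mu + sigma),   # one s.d. above the median
    }


points = line_points(MU, SIGMA)
for pct in (84, 50, 16):
    print(f"{pct:>2}% equalled or exceeded: {points[pct]:>8,.0f} cusecs")
```

On a logarithmic discharge scale the two outer points are equally spaced about the 50% point, which is why a single measured interval suffices on the graph.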


TABLE 8

$\sigma$ INTERVAL

Calculated     $\sigma$ Interval         Calculated     $\sigma$ Interval
$\sigma$       in inches                 $\sigma$       in inches

0.30           15/32                     0.56           7/8
0.31           15/32                     0.57           7/8
0.32           15/32                     0.58           7/8
0.33           1/2                       0.59           29/32
0.34           1/2                       0.60           29/32
0.35           17/32                     0.61           15/16
0.36           17/32                     0.62           15/16
0.37           9/16                      0.63           31/32
0.38           9/16                      0.64           31/32
0.39           19/32                     0.65           1
0.40           5/8                       0.66           1
0.41           5/8                       0.67           1 1/32
0.42           5/8                       0.68           1 1/32
0.43           21/32                     0.69           1 1/16
0.44           11/16                     0.70           1 1/16
0.45           11/16                     0.71           1 3/32
0.46           23/32                     0.72           1 1/8
0.47           23/32                     0.73           1 1/8
0.48           3/4                       0.74           1 5/32
0.49           3/4                       0.75           1 5/32
0.50           25/32                     0.76           1 5/32
0.51           25/32                     0.77           1 3/16
0.52           13/16                     0.78           1 3/16
0.53           13/16                     0.79           1 7/32
0.54           27/32                     0.80           1 7/32
0.55           27/32

[Figure 10: best fit log normal line through points a, b and c on log probability
paper, North Saskatchewan River at Edmonton.]


APPENDIX  5 




INSTRUCTIONS  FOR  FITTING  CONFIDENCE  BANDS  TO 
THE  BEST  LOG  NORMAL  FIT 

(Illustrated  by  data  of  North  Saskatchewan  River  at  Edmonton) 

Sampling  Distribution 

The record of river discharges has been considered as a sample of the
parent population, which is of infinite size. For the sample considered it
is possible to calculate various statistics such as the mean and standard deviation.
If another sample were chosen it would have its own mean and standard deviation,
which would probably be different from those of the first sample. Thus if many different
samples were chosen we would have a distribution of the statistic, which will be
referred to herein as a sampling distribution. This sampling distribution will be
considered as a distribution which is obtained under random conditions. It can be
shown with the Central Limit Theorem (Kenney and Keeping, Part 2) that as the
number of observations in each sample tends to infinity the sampling distribution of
means approaches the distribution function of the normal law. The standard deviation
of the sampling distribution will be called the standard error.

Calculation  of  Standard  Errors  of  Percentiles  for  a  Normal  Sampling  Distribution 

The method of calculation of the standard error of a percentile can be
found in "An Introduction to the Theory of Statistics" by Yule and Kendall, Griffin
& Co., Fourteenth Edition, Sections 8.22 to 8.24.

$$ \text{Standard Error of } (100\,p) \text{ percentile} = \frac{\text{s.d.}}{y_p} \sqrt{\frac{p\,q}{r}} $$

where s.d. = standard deviation about the mean variate in the sample
p = probability, i.e. proportion of time equalled or exceeded
q = 1 - p
$y_p$ = ordinate of the normal curve for the given value of p
r = number of discharges in the sample.

As before it is necessary to scale the values on the logarithmic graph paper and
thus one must use Table 8 to find the $\sigma$ interval in inches. It is then possible to
express the standard errors of the various percentiles (percentages equalled or
exceeded) in inches and thus scale them off on either side of the best fit line.

Table 9 gives the factors to convert $\sigma$ intervals to standard error
intervals using the $\sigma$ intervals of Table 8.




TABLE 9

STANDARD ERROR INTERVAL

% of time equalled       Standard error interval
or exceeded              in inches

1% and 99%               $\sigma$ interval $\times\ 3.76 / \sqrt{r}$
2% and 98%               $\sigma$ interval $\times\ 2.90 / \sqrt{r}$
5% and 95%               $\sigma$ interval $\times\ 2.11 / \sqrt{r}$
10% and 90%              $\sigma$ interval $\times\ 1.71 / \sqrt{r}$
20% and 80%              $\sigma$ interval $\times\ 1.43 / \sqrt{r}$
30% and 70%              $\sigma$ interval $\times\ 1.32 / \sqrt{r}$
40% and 60%              $\sigma$ interval $\times\ 1.27 / \sqrt{r}$
50%                      $\sigma$ interval $\times\ 1.25 / \sqrt{r}$

Note: r = number of discharges in sample


[Figure 11: best fit log normal line with standard error intervals and confidence
band on log probability paper, North Saskatchewan River at Edmonton.]


An example of the calculation of the standard error intervals for the
North Saskatchewan River at Edmonton annual maximum flows follows:

% equalled           Standard error
or exceeded          interval in inches

50%                  0.160" = 5/32"
40% and 60%          0.163" = 5/32"
30% and 70%          0.169" = 5/32"
20% and 80%          0.183" = 3/16"
10% and 90%          0.219" = 7/32"
5% and 95%           0.252" = 1/4"
2% and 98%           0.372" = 3/8"
1% and 99%           0.481" = 15/32"

where r = 47, $\sigma$ interval = 7/8" = 0.875" and ($\sigma$ interval)$/\sqrt{r}$ = 0.128"
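
The conversion factors can be regenerated from the standard error formula rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library (version 3.8 or later) for the normal deviate and ordinate; the function names are hypothetical. It gives about 1.25 at 50% and 1.71 at 10% and 90%; at the extreme percentiles the computed factor may differ in the second decimal place from hand-calculated values.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()   # mean 0, standard deviation 1


def standard_error_factor(p):
    """Return sqrt(p * q) / y_p for a proportion p equalled or exceeded.

    y_p is the ordinate of the standard normal curve at the deviate that
    is exceeded with probability p.
    """
    q = 1.0 - p
    z = STANDARD_NORMAL.inv_cdf(q)
    y_p = STANDARD_NORMAL.pdf(z)
    return math.sqrt(p * q) / y_p


def standard_error_interval(p, sigma_interval, r):
    """Standard error interval, in the same units as sigma_interval."""
    return sigma_interval * standard_error_factor(p) / math.sqrt(r)


# North Saskatchewan River at Edmonton: r = 47, sigma interval = 0.875 inch.
for pct in (50, 40, 30, 20, 10, 5, 2, 1):
    inches = standard_error_interval(pct / 100.0, 0.875, 47)
    print(f"{pct:>2}%: factor {standard_error_factor(pct / 100.0):.2f}, "
          f"interval {inches:.3f} in.")
```

By symmetry of the normal curve the factor for p equals the factor for 1 - p, so each row serves a pair of percentages.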

The intervals were plotted on Figure 11 and smoothing curves were
drawn. These smoothing curves enclose the 68% confidence band. In other
words, the discharge will fall within this area 68% of the time. Other confidence
bands could be constructed in a similar manner. The confidence band for
2 ($\sigma$ interval) can be calculated by using Table 9 and multiplying the results by 2.

This band estimates the range of discharges which will take place 95% of the time.
It is recommended that the 68% and 95% confidence bands be used. It can be
seen that all the discharges from the North Saskatchewan River fall within these
confidence bands.

Note that the best fit and confidence band methods described herein
can only be used with Hazen, Whipple and Fuller Logarithmic
Probability Paper of the same scale and dimensions as used throughout this thesis.


LIST  OF  REFERENCES 


1.  Blench, T., 1957 - Regime Behaviour of Canals & Rivers. Butterworths.

2.  Carter, R. W., 1951 - Floods in Georgia - Frequency and Magnitude.
    Geological Survey, Circular 100, U.S. Dept. of Interior.

3.  Gumbel, 1942 - Statistical Control Curves for Flood Discharges.
    Trans. Amer. Geophys. Union, Vol. 23, pt. 2, pp. 489-503.

4.  Johnston, D., Cross, W. P., 1949 - Elements of Applied Hydrology.
    Ronald Press.

5.  Kenney, J. T., Keeping, E. S., 1951 - Mathematics of Statistics.
    D. Van Nostrand.

6.  Kuiper, E., 1957 - 100 Frequency Curves of North American Rivers.
    Proc. Amer. Soc. Civil Engrs., Vol. 83, No. HY5, paper 1395.

7.  Linsley, R. K., Kohler, M. A., Paulhus, J. L., 1949 - Applied Hydrology.
    McGraw Hill.

8.  Merriman, M., 1910 - Method of Least Squares. Wiley.

9.  Subcommittee of the Joint Division Committee on Floods, 1953 - Review of
    Flood Frequency Methods. Trans. Amer. Soc. Civil Engrs.,
    Vol. 118, pp. 1220-1230.

10. Ven Te Chow, 1954 - The Log Probability Law and Its Engineering Applications.
    Proc. Amer. Soc. Civil Engrs., Vol. 80, Separate No. 536.




